Abstract

To determine whether English speech produced by Ethiopian speakers is syllable-timed or stress-timed, this study
examined stress and rhythm in the pronunciation of Ethiopian speakers of English, focusing on one language group:
native speakers of Amharic. Speech recorded from four Amharic-speaking learners and two Canadian native speakers of
English was analyzed acoustically, and the pitch contours and utterance durations of the Amharic speakers were
compared with those of the native speakers, who served as a point of reference. The acoustic analysis showed that
the Amharic speakers’ samples displayed pitch peaks on almost all words and took longer to articulate. The acoustic
measures used for prosodic assessment of Ethiopian English thus exemplified the most frequent production tendencies
that learners should attend to. English pronunciation teaching for Ethiopians should include practice in stressing,
unstressing, and rhythm to help learners overcome the influence of the syllable-timed rhythm of their mother tongue.

Keywords: Ethiopian learners, rhythm, pronunciation teaching, pronunciation learning, Amharic learners, English as a 
1. Introduction

Rhythm is often associated with a kind of periodicity, or a recurrence of certain patterns of color, design, or sound at 
regular (equal) intervals of space or time. For example, in music, rhythm is usually produced by making a certain kind 
of beat in a sequence standing out from others by being louder, longer, or higher at equal intervals of time (Roach, 
2001). As Roach (2001: 36; 2002: 67) puts it, rhythm in language likewise refers to the periodic
recurrence of certain patterns of sound in utterances: “...syllables take the place of musical notes or beats, and in many
languages the stressed syllables determine the rhythm”.

It has been claimed that in some languages of the world, syllables constituting utterances, whether accented or 
unaccented, tend to occur at equal time intervals (Batibo, 2000; Dalton and Seidlhofer, 1994; Jenkins, 2000; Roach, 
2001). The time taken from one accented or stressed syllable to the next will be in proportion to the number of 
unaccented syllables between them. Such languages are said to have syllable-timed rhythm. Some other languages of 
the world, on the other hand, have stress-timed rhythm. In these languages, accented syllables have a tendency to occur 
at approximately equal intervals of time, irrespective of the number of unaccented syllables intervening between one 
accented syllable and the next. 

According to this theory, English, for example, belongs to the second category of languages and has stress-timed 
rhythm (Roach, 2002: 36). This would mean that, in English utterances, accented syllables tend to occur at 
approximately equal intervals of time. On the other hand, “unstressed syllables between the stressed syllables are 
squeezed into the time available, with the result that they may become very short” (Roach, 2001: 36). 
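
The two idealized rhythm types described above can be written compactly. The notation below (I_k, n_k, d, T) is introduced here purely for illustration and does not come from the cited sources; both cases are idealizations, since the literature speaks only of tendencies.

```latex
% n_k : number of unaccented syllables between accented syllables k and k+1
% I_k : time interval from accented syllable k to accented syllable k+1

% Idealized syllable timing: every syllable lasts the same time d,
% so the interval grows in proportion to the number of syllables it spans
I_k = (n_k + 1)\, d

% Idealized stress timing: every inter-stress interval lasts the same time T,
% so the mean syllable duration inside the interval shrinks as n_k grows
I_k = T, \qquad \bar{d}_k = \frac{T}{n_k + 1}
```

With T fixed, an interval containing three unaccented syllables leaves T/4 per syllable on average, against T/2 when only one unaccented syllable intervenes, which corresponds to the squeezing of unstressed syllables that Roach describes.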

To examine rhythm and its relationship with accent more closely, it is important to consider
‘rhythm units’ (O’Connor, 1980: 90-100). When groups of words are spoken continuously, a
break or pause occurs after a group, but not within it (O’Connor, 1980). Similarly, Roach (2002: 52) describes tone
units: “... continuous speech can be broken up into units called tone units [emphasis original], and that each of these
will have one syllable that can be identified as the most prominent”. Within each word group or tone unit, there is at
least one stressed syllable. A stressed syllable in a group may have one or more unstressed syllables before it,
and these unstressed syllables are said very quickly, which makes them short. The stressed syllable
may also be followed by one or more unstressed syllables. However, these unstressed syllables ‘are not said specially
quickly, rather share the amount of time which a single stressed syllable would have’ (O’Connor, 1980: 96). For
example, the English words “nine”, “ninety”, and “ninetieth” all take about the same time to say as “nine”; likewise, the
sentences “I am here”, “I was here”, and “I was in here” take about the same time as one another, because ‘the unstressed
syllables are all very short, as short as you can make them’ (O’Connor, 1980: 96).

In explaining the fundamental rule of stress-timedness, O’Connor (1980: 98) says that ‘each stress group within a
word group is given the same amount of time’, where a stress group consists of a stressed syllable together with any
unstressed syllables that follow it. For example, in the sentence “both of them left early”, ‘both of them’ is one stress group,
‘left’ is another, and ‘early’ is another, all taking the same amount of time.

On the other hand, when unstressed syllables precede the stressed ones, as in “I am going home”,
there are two stress groups, ‘going’ and ‘home’. The initial ‘I am’ does not belong to any stress group
because it comes before the stress, and it is said very quickly, more quickly than the unstressed syllables within the stress groups
(O’Connor, 1980). This pattern contributes to the stress-timedness of English, as O’Connor (1980: 99) describes:
“In this sort of arrangement, any unstressed syllable before the stressed syllable is said very quickly and
doesn’t affect the length of syllables before it”.
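
The grouping rule described above can be sketched in a few lines of Python. This is a minimal illustration, not O’Connor’s own procedure: the function name `stress_groups` is hypothetical, stress is marked by hand, and stress is assigned per word rather than per syllable for simplicity.

```python
from typing import List, Tuple

Word = Tuple[str, bool]  # (word, carries_stress); stress marks are hand-assigned


def stress_groups(words: List[Word]) -> Tuple[List[str], List[List[str]]]:
    """Split a stress-marked word sequence into stress groups.

    Returns (leading_unstressed, groups). Words before the first stress
    belong to no group; every stressed word opens a new group, and the
    unstressed words that follow it are attached to that group.
    """
    leading: List[str] = []
    groups: List[List[str]] = []
    for word, stressed in words:
        if stressed:
            groups.append([word])      # a stressed word starts a new group
        elif groups:
            groups[-1].append(word)    # unstressed word follows a stress
        else:
            leading.append(word)       # unstressed word before any stress
    return leading, groups


# The two example sentences from the text, with stress marked as in the text.
both = [("both", True), ("of", False), ("them", False),
        ("left", True), ("early", True)]
home = [("I", False), ("am", False), ("going", True), ("home", True)]

print(stress_groups(both))  # ([], [['both', 'of', 'them'], ['left'], ['early']])
print(stress_groups(home))  # (['I', 'am'], [['going'], ['home']])
```

The sketch only recovers the grouping; it says nothing about timing, which has to be measured from recordings.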

A unit of this kind, with a stressed syllable as its center and any unstressed syllables that precede or follow it, is called a
rhythm unit (ibid). For example, according to O’Connor (1980: 90), “I am going home for Christmas” contains three
rhythm units, ‘I am going’, ‘home’, and ‘for Christmas’, with the stressed syllables falling on ‘going’, ‘home’, and
‘Christmas’ respectively.

Not all languages share this rhythmic pattern; some of the world’s languages have syllable-timed rhythm
(Roach, 2001: 37; Roach, 2002: 67). Phoneticians have claimed that English is stress-timed, and therefore
learners of English as a second or foreign language whose native language is syllable-timed need to learn those patterns of
English pronunciation that are foreign to their native rhythm. In Ethiopia, for example, there are more than
80 local languages, and Ethiopian learners of English may speak any of these as a mother tongue. However,
little work seems to be available on the rhythmic patterns of these languages, particularly with
regard to the broad dichotomy between stress-timedness and syllable-timedness. For instance, the pioneering works on
Amharic grammar by Baye (2000) and Getahun (1990) addressed the phonology of Amharic with a predominant focus on
its segmental aspects, with no considerable mention of intonation and rhythm.

If we are to help learners in Ethiopia improve their English pronunciation, one thing we should do with respect to
rhythm is to identify whether the learners’ mother tongue has the same rhythmic pattern as English or a different one.
Because Ethiopia is a multilingual country, a single study like the present one cannot deal with all of its languages. Therefore,
this study addresses the issue for one language group only, namely Amharic. The selection of Amharic in this
study is purposive, as the present researcher speaks Amharic as a first language.

The question of whether Amharic is syllable-timed or stress-timed arose four years ago while the researcher was working on his
PhD dissertation, which investigated the intelligibility of Amharic speakers’ spoken English to native English
speakers (Anegagregn, 2012). That study was concerned with unintelligibility estimates and recommended that future
research investigate closely what may facilitate or debilitate intelligibility in spoken English between
Ethiopians and other groups of speakers (i.e. both native and non-native English speakers).

Previous studies of pronunciation in English as a foreign/second language have often considered the phonological
differences between learners’ mother tongue and the target language when exploring areas of difficulty in
English pronunciation. Contrastive analysis between the phonology of English and that of the learners’ mother tongue
has been one research area for both practising teachers and theoreticians, with the aim of identifying problem areas
for foreign/second language learners and helping learners attend to and become familiar with them.
In this regard, a couple of studies in Ethiopia contrasted the phonology of Oromipha and Amharic
with that of English (Anegagregn, 2014; Italo, 1988). Anegagregn (2014), for example, contrasted both segmental and
suprasegmental aspects of English and Amharic and identified possible difficulty areas of English pronunciation for
Amharic learners. Among other factors, Anegagregn predicted stress to be potentially the most problematic aspect of
English pronunciation for Amharic-speaking learners.

Whether Amharic is syllable-timed or stress-timed in its rhythm has not, however, been verified by previous studies.
Anegagregn (2014), for instance, left the issue of Amharic rhythm unanswered, merely speculating that Amharic is
syllable-timed if it is not stress-timed. Such a definition of the rhythm of the world’s languages is common in the literature,
as O’Connor (1980) also indicates: ‘everything non-stress-timed is syllable-timed’. However, these claims
should be verified through actual production data taken from speakers. Accordingly, this study explores the nature
of stress and rhythm in the speech produced by Ethiopian learners who speak Amharic as a first language, and seeks to verify
whether Amharic is stress-timed or syllable-timed.

2. Method 

Using the speech analysis software PRAAT (version 6.0.20), the study carried out acoustic analysis of speech recorded
from four Amharic-speaking learners (two females and two males) and two native speakers of English
(one female and one male). All participants took part in the study voluntarily. A read-aloud technique was
used for recording: the participants were asked to read aloud the sentence ‘You have to be so early if
you want to find a parking place’. Native speakers were used in this study not as a norm or a goal to aim at but
only as a model of the rhythmic patterns of English for comparison.

Utterances by female speakers were digitally filtered with a pitch ceiling of 300 Hz and a pitch floor of 100 Hz, and those by
male speakers with a ceiling of 250 Hz and a floor of 70 Hz. Gender-specific range settings applied before analyzing voice samples
have been widely used in previous studies for the efficiency and speed of acoustic measurement (Abebayehu, 2007; Nagamine, 2002).
Such floor and ceiling settings mean that sounds in a speech sample below or above these frequencies are ignored. Research
involving prosodic assessment has mostly used a gender-specific low-pass filtering technique (70/100 Hz floor, 250/300 Hz
ceiling) that removes most of the segmental information from the signal while leaving rhythmic and intonational
features largely intact (Nagamine, 2002).
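
The effect of the gender-specific bounds quoted above (100 to 300 Hz for female voices, 70 to 250 Hz for male voices) can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the function `clip_to_range` and the F0 values are hypothetical, and the sketch masks values after the fact, whereas a pitch tracker such as the one in PRAAT applies its floor and ceiling while searching for pitch candidates.

```python
from typing import Dict, List, Optional, Tuple

# Analysis ranges in Hz as (lower bound, upper bound), taken from the text.
PITCH_RANGE_HZ: Dict[str, Tuple[float, float]] = {
    "female": (100.0, 300.0),
    "male": (70.0, 250.0),
}


def clip_to_range(f0_track: List[Optional[float]],
                  gender: str) -> List[Optional[float]]:
    """Keep F0 values inside the gender-specific range; mark the rest as None.

    None stands for a frame with no usable pitch (unvoiced or out of range).
    """
    low, high = PITCH_RANGE_HZ[gender]
    return [f0 if f0 is not None and low <= f0 <= high else None
            for f0 in f0_track]


# Hypothetical F0 values in Hz, invented for illustration only.
track = [None, 85.0, 190.0, 240.0, 320.0]
print(clip_to_range(track, "female"))  # [None, None, 190.0, 240.0, None]
print(clip_to_range(track, "male"))    # [None, 85.0, 190.0, 240.0, None]
```

The same 85 Hz frame is kept for a male voice and discarded for a female one, which is the point of setting the range per gender before measurement.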

Acoustically, stress or accent is detected through a change of pitch level or pitch prominence, while intonation
consists of recurring pitch patterns (Gimson, 1980; Roach, 2001). The acoustic correlate of pitch is
fundamental frequency (F0), measured in cycles per second and expressed in Hertz (Hz). Hence, F0 measures of the
pitch of each syllable and the F0 shapes displayed in the PRAAT picture window were used to investigate the stress
and rhythm tendencies of the sample speeches. The visible pitch contours displayed in PRAAT were also used in the
analysis.

3. Results and Discussion 

Pitch prominence in the sample speeches was detected in the ‘draw visible pitch contour’ window, which shows the point in time,
syllable, and word that received the highest peak or pitch prominence. By locating the syllable of each word
where the highest peak occurred, each utterance was analyzed for the words on which a
change of pitch level or prominence occurred. In other words, the highest peaks across the contour indicated the words
whose syllables were accented (Gimson, 1980).
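
Locating peaks in a pitch contour, as described above, amounts to finding local maxima in the F0 track. The following Python sketch shows one simple way to do this; it is not the procedure the study ran inside PRAAT, the function name `pitch_peaks` is hypothetical, and the contour values are invented for illustration.

```python
from typing import List, Optional, Tuple


def pitch_peaks(times: List[float],
                f0: List[Optional[float]]) -> List[Tuple[float, float]]:
    """Return (time, F0) for every strict local maximum among voiced frames.

    Frames whose F0 is None (unvoiced) are skipped, so neighbours are
    compared across short unvoiced gaps.
    """
    voiced = [(t, f) for t, f in zip(times, f0) if f is not None]
    peaks: List[Tuple[float, float]] = []
    for i in range(1, len(voiced) - 1):
        prev_f, cur_f, next_f = voiced[i - 1][1], voiced[i][1], voiced[i + 1][1]
        if cur_f > prev_f and cur_f > next_f:
            peaks.append(voiced[i])
    return peaks


# Hypothetical contour: time in seconds, F0 in Hz (None = unvoiced frame).
times = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
f0 = [110.0, 150.0, 120.0, None, 115.0, 170.0, 130.0, 100.0]

peaks = pitch_peaks(times, f0)
print(peaks)                           # [(0.1, 150.0), (0.5, 170.0)]
print(max(peaks, key=lambda p: p[1]))  # (0.5, 170.0), the highest peak
```

Mapping each peak time to a syllable or word still requires a time-aligned transcription of the utterance, which the sketch does not provide.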

Both native speakers tended to segment their speech into five syntactic groups: ‘you have to be / so early / if you want
/ to find / a parking place’. As can be seen in the native speakers’ pitch contours below, both showed a
falling pitch at the end of each unit or segment; their pitch turned downwards at the
words ‘be’, ‘early’, ‘want’, ‘find’, and ‘place’, which received stress. This pattern of segmenting an
utterance or longer stretch of speech is common in native speakers’ speech and is known to make it easier for listeners to process
and interpret information (O’Connor, 1980). The words grouped within one such segment form a tone group or
information unit.

As displayed in the following figures, both native English speakers showed a gradual fall to the lowest point in the last
tone group, on the last word ‘place’, probably to mark the end of their speech. Meanwhile, both native speakers tended to
show pauses of approximately equal length between the tone groups (0.35 sec.), and approximately equal intervals between
stressed syllables in each tone group (1.5 sec.). Such approximately equal intervals of time across stressed syllables and between tone
groups gave the native speakers’ speech a regular and consistent rhythm. This rhythm, which is generally described
as stress-timed, is often regarded as the backbone of English intonation (O’Connor, 1980). Thus, English is
generally described as an intonation or stress-timed language.

Figure 1. Pitch contour of native participant 1 (NP1). Panel title: Native_male_participant_NMP_

Figure 2. Pitch contour of native participant 2 (NP2). Panel title: Native_female_participant_NFP_


Unlike the native English speakers, the Amharic speakers’ samples displayed pitch peaks on almost all
words in the sentence. Each syllable was clearly and loudly audible, taking an adequate and
approximately equal time to articulate. As a result, a peak appeared on almost all syllables throughout the
utterance. In other words, all words in the sentence seemed to receive stress. Such a pattern of stressing all
syllables is not common in English speech, as the native speaker participants demonstrated.


The Amharic speakers also took longer to produce the utterance (4.5 seconds on average) than the native English
speakers, who took an average of 2.5 seconds. One possible reason is that the native English speakers uttered the
unstressed syllables very fast, while the Amharic speakers spent an approximately equal length of time on all syllables.
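
The reported average durations can be turned into rough speech rates for the test sentence. The Python sketch below is a back-of-the-envelope calculation, not a measurement from the study: the syllable counts are assigned by hand, and because the reported durations include pauses, the figures are overall speech rates rather than articulation rates.

```python
# Hand-assigned syllable counts for the sentence read aloud in the study.
SYLLABLE_COUNTS = [
    ("You", 1), ("have", 1), ("to", 1), ("be", 1), ("so", 1), ("early", 2),
    ("if", 1), ("you", 1), ("want", 1), ("to", 1), ("find", 1), ("a", 1),
    ("parking", 2), ("place", 1),
]
TOTAL_SYLLABLES = sum(count for _, count in SYLLABLE_COUNTS)  # 16

# Average utterance durations in seconds, as reported in the text.
MEAN_DURATION_S = {"Amharic speakers": 4.5, "native English speakers": 2.5}


def speech_rate(total_syllables: int, duration_s: float) -> float:
    """Syllables per second over the whole utterance, pauses included."""
    return total_syllables / duration_s


for group, duration in MEAN_DURATION_S.items():
    rate = speech_rate(TOTAL_SYLLABLES, duration)
    mean_syllable_s = duration / TOTAL_SYLLABLES
    print(f"{group}: {rate:.2f} syll/s, {mean_syllable_s:.3f} s per syllable")
# Amharic speakers: 3.56 syll/s, 0.281 s per syllable
# native English speakers: 6.40 syll/s, 0.156 s per syllable
```

On these averages the Amharic speakers spent roughly 1.8 times as long per syllable (4.5 / 2.5) as the native English speakers.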


Figure 3. Pitch contour of Amharic native participant 1 (ANP1). Panel title: Amharic_native_female_participant


Some variation was also observed in how speakers segmented their speech into tone or information units. For
example, the Amharic speaker presented above divided her speech into three segments, ‘you have to be so early / if
you want / to find a parking place’, which made her units inconsistent in the number of words, the length of
time, and syntactic structure.

Similarly, more inconsistent patterns of segmentation and contour shapes were observed in the other three Amharic
speakers. One of the male speakers (ANP3), as the figures below show, spoke ‘you have to be so early if
you want’ as one tone group with a falling pitch, while ‘to find’ and ‘a parking place’ formed separate tone groups.


Figure 4. Pitch contour of Amharic native participant 3 (ANP3). Panel title: Amharic_native_male_participant_3

Figure 5. Pitch contour of Amharic native participant 2 (ANP2). Panel title: Amharic_native_male_participant_2


4. Conclusion

The study examined the nature of stress and rhythm in the speech produced by Amharic-speaking learners of English
and compared it with the speech produced by native English speakers. Speech samples were collected from
the Amharic speakers and the native English speakers through a read-aloud elicitation
technique. The samples were then analyzed acoustically for their prosodic properties using PRAAT (Boersma, 2001).

The prosodic measures and pitch contours resulting from the computer-assisted acoustic analysis reveal that the sample
speeches taken from the Ethiopian speakers differ in several ways from those of the native speakers.

First, syllables in the sample speeches produced by the Amharic speakers were given approximately equal
weight, regardless of whether they were stressed or unstressed. As a result, Amharic speakers’ pronunciation of
English may sound, as Gimson (1980: 40) describes, “staccato-like to the native speakers ears”. According to
Gimson, this particular type of rhythm can adversely affect the comprehensibility of their English to native
speakers. In the speech samples of the native English speakers, by contrast, no F0 measure was observed
on syllables such as ‘to’, ‘if’, and ‘a’. Repeated playback of the sentence at these words revealed
very fast articulation before or after the stressed words that received peaks. Syllables that receive no
pitch measure and are said very fast are called unaccented syllables (Roach, 2001).

The other important characteristic of the Amharic speakers’ pronunciation that differed from that of the native
speakers of English was the segmentation of utterances into units or groups. Not only did the Amharic
speakers vary in how they segmented the utterance into information units, but the direction of the pitch at
the end of each group also varied between units and between speakers. The data show that the Amharic speakers’ speech is also
segmented but lacks rhythm at equal intervals because of inconsistencies in the number of words, the length of
time, and the syntactic structure of the units. Variation among the Amharic speakers can be illustrated from the
sample speeches: one speaker (ANP3) produced ‘you have to be so early if you want’ as one tone group
with a falling pitch and ‘to find’ and ‘a parking place’ as separate tone groups, while another participant
(ANP2) began with short tone groups, ‘you have’ with a rising pitch and then ‘to be’ with a ‘fall-rise-fall’
pitch contour, followed by one long final group, ‘early if you want to find a parking place’. Both of these speakers
(ANP2 and ANP3), however, showed a falling contour at the end of their utterances.

By contrast, both native speakers consistently segmented their speech into syntactic groups, i.e. ‘you have to be / so early / if you want
/ to find / a parking place’. Both native speakers also tended to show pauses of approximately equal length between
the tone groups (0.35 sec.), and approximately equal intervals between
stressed syllables in each tone group (1.5 sec.). Such approximately equal intervals of time across stressed syllables and
between tone groups gave the native speakers’ speech a regular and consistent rhythm.

To sum up, speech samples collected through a read-aloud elicitation technique were acoustically analyzed using
PRAAT for prosodic assessment. In general, unlike the native speakers, the Amharic
speakers did not display consistent pitch contours or information unit segmentation, which probably reflects a
different intonation and rhythmic pattern in each utterance. Each Amharic speaker also showed pitch contours different from those of the
native speakers, whose patterns were approximately similar and consistent. In addition, the time intervals between pauses
were irregular and longer, which sometimes made the utterance sound full of interruption or hesitation, as shown, for
example, by two of the Amharic speakers (ANP1 and ANP2).

On the whole, then, the acoustic analysis of F0 measures, peaks, and pitch contours of the native English and
native Amharic speakers captured the basic prosodic features of each utterance. In particular, the patterns of stress,
rhythm, and intonation in the samples were measured objectively, showing the speakers’ characteristic prosodic
tendencies. The results generally revealed that the Amharic speakers’ samples showed no similarity to the stress,
intonation, and rhythmic patterns of the native speakers. It must be noted, however, that the acoustic
differences between the Amharic speakers’ samples and those of the native English speakers were used only as
indices of stress and rhythm in Ethiopian speech, not as goals to be aimed at.

The findings have important implications for language teaching and learning. One point to note is that learners
of English as a second or foreign language, particularly those from syllable-timed language backgrounds, are
likely to face some difficulty with English rhythm. Phoneticians therefore suggest that adequate practice in this area is
very important to the learner, and that to ignore it would be to neglect a vital aspect of English pronunciation (Dalton and
Seidlhofer, 1994; Jenkins, 2000; Kenworthy, 1987; O’Connor, 1980). Almost all of these writers argue that not following
the rhythmic pattern of English makes one’s speech difficult to follow and understand, which may lead
to loss of intelligibility.

Speakers of the same first language share a cultural and linguistic background, and
the rules governing their pronunciation in general, and rhythm in particular, are therefore common to them. Understanding
between such interlocutors may not be difficult. In an international context, however, speakers do not share the same
knowledge, and communication becomes more complex. In situations where Ethiopians
communicate with native English speakers (or proficient non-native speakers), not only may their speech sound
‘staccato’, but they may also fail to comprehend what is conveyed to them. Therefore, to communicate effectively
in international contexts, Ethiopians as speakers and listeners need to develop some flexibility and awareness
regarding the stress-timed nature of English pronunciation.

The importance of rhythm and stress in Ethiopian Teaching English as a Foreign Language (TEFL) programs should not
be ignored. Students should develop sensitivity to English stress and rhythm, while teachers should be
prepared to help their students perform better in an international environment. To that end,
feasible ways of training in stress and rhythm should be put in place.